Skip to main content

ClickHouse

Querying Overview

ClickHouse uses structured query language (SQL) for managing and querying data. ClickHouse supports a wide variety of data types including JSON and the cloud version support vectors. A general reference for querying can be found at
https://clickhouse.com/docs/sql-reference

Information on the many SQL functions is at https://clickhouse.com/docs/sql-reference/functions

Datatypes

ClickHouse supports a variety of data types. Many of these map into reasonable JavaScript data types. Details on the data types can be found at https://clickhouse.com/docs/sql-reference/data-types

Date and Timestamp Handling

ClickHouse does not automatically convert database date/time types to JavaScript Date objects. Qarbine automatically converts date oriented values into JavaScript Date objects. To turn this automatic conversion off for a query add the following line to the query specification.

#pragma noDateConversion

When “on” the ClickHouse streaming feature is not available, in which case if there is no LIMIT clause then the complete answer set is retrieved prior to applying any soft maximum number of answer set rows.

Handling JSON Data

ClickHouse supports a JSON datatype that when retrieved by Qarbine is a genuine JavaScript object (versus a string). Legacy SQL centric tools have to handle the string format which typically results in a large burden on whomever or whatever is analyzing the data. With Qarbine the data is in its easy to interact with natural form- no coding necessary to handle dynamic JSON data structures.

Information on functions to interact with JSON data is at https://clickhouse.com/docs/sql-reference/functions/json-functions

Vector Queries

Vector searches can be used to find “similar” content based on an embedding array. The array provides semantic context necessary to perform the search.The notion of “distance” is used to determine similarity. There are a variety of distance functions available as discussed at https://clickhouse.com/docs/sql-reference/functions/distance-functions

A sample query specification is shown below. It references a Qarbine variable with the embedding array value.

SELECT title, plot, year, director,
L2Distance(embedding, [! @embedding!] ) AS score
FROM movie_plots
ORDER BY score ASC
LIMIT 10

Another technique for the distance criteria is shown below

 L2Distance(embedding, [! embedding(@searchPhrase, "myAiAssistant") !] ) AS score

It dynamically obtains the embedding array for the text contained in the searchPhrase variable using a configured Qarbine AI Assistant.

For more information see https://clickhouse.com/docs/knowledgebase/vector-search

Qarbine Virtual Queries

There are a few convenience queries which are mainly DBA oriented. These queries are recognized by the Qarbine driver and provide common database information. Any catalog and schema set in the data service definition constrains what is returned. For example, if a catalog is given in the data service, then only schemas in that one catalog are returned.

These virtual query defaults are independent of whatever drop down option is chosen in the Data Source Designer tool. If a specific schema’s information is wanted for example, it must be explicitly given.

Query Description
list databasesReturn a list of visible databases.
list tables [DATABASE]Return a list of tables. The optional argument may be a database name.
describe tables [DATABASE]Provide details on all of the tables. The optional argument may be of the form “schema” or “catalog.schema”. This may take a while depending on your database structure.
describe table TABLEProvide details on the given table.

See the “DBA Productivity” section of the online documentation for more details.

Troubleshooting

If errors occur while using Qarbine then a good course of action is to run the query using the ClickHouse online SQL console. After logging into the console click

  

Information on using the ClickHouse SQL console is at
https://clickhouse.com/docs/cloud/get-started/sql-console